Prosody-Dependent Acoustic Modeling Using Variable-Parameter Hidden Markov Models
Abstract
In an effort to make prosody useful for spontaneous speech recognition, we adopt a quasi-continuous prosodic annotation and design a prosody-dependent acoustic model accordingly to improve ASR performance. We propose a variable-parameter hidden Markov model in which the mean vector is a function of the prosody variable, expressed through a polynomial regression model. The prosodically adapted acoustic models are used to re-score the N-best output of a standard ASR system according to the prosody variable assigned by an automatic prosody detector. Experiments on the Buckeye corpus demonstrate the effectiveness of our approach.
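The two mechanisms named in the abstract, a Gaussian mean that varies polynomially with the prosody variable and N-best re-scoring, can be illustrated with a minimal sketch. This is not the authors' implementation: the scalar prosody variable, the diagonal covariance, the coefficient layout, and the weighted score combination are all assumptions made for illustration, and every function name is hypothetical.

```python
import math


def prosody_mean(coeffs, v):
    """Mean vector as a polynomial in the scalar prosody variable v.

    coeffs[k] is the coefficient vector of v**k (assumed layout),
    so mu(v) = sum_k coeffs[k] * v**k.
    """
    dim = len(coeffs[0])
    return [sum(c[d] * v ** k for k, c in enumerate(coeffs)) for d in range(dim)]


def log_gaussian(x, mean, var):
    """Log density of a diagonal-covariance Gaussian (assumed form)."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * s) + (xi - m) ** 2 / s)
        for xi, m, s in zip(x, mean, var)
    )


def frame_log_likelihood(x, v, coeffs, var):
    """Score one feature frame under a Gaussian whose mean depends on v."""
    return log_gaussian(x, prosody_mean(coeffs, v), var)


def rescore_nbest(hypotheses, weight=1.0):
    """Re-rank N-best entries, best first.

    Each entry is (text, baseline_score, adapted_score) in the log domain;
    `weight` is an assumed interpolation parameter, not from the paper.
    """
    return sorted(hypotheses, key=lambda h: h[1] + weight * h[2], reverse=True)


# First-order polynomial over a 2-dimensional feature space.
coeffs = [[0.0, 1.0], [2.0, -1.0]]
mean_at_half = prosody_mean(coeffs, 0.5)
ranked = rescore_nbest([("hyp a", -10.0, -5.0), ("hyp b", -11.0, -2.0)])
```

With a polynomial of order zero the model reduces to an ordinary Gaussian with a fixed mean, which is why a regression on the prosody variable can be read as a generalization of the standard HMM output density.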
Similar resources
An Intonational Phrase Boundary and Pitch Accent Dependent Speech Recognizer
Does prosody help word recognition? In this paper, we propose a novel probabilistic framework in which word and phoneme are dependent on prosody in a way that improves word recognition. We describe the idea of prosody dependent speech recognition by building a prosody dependent speech recognizer that conditions word and phoneme models on two important prosodic variables: intonational phrase bou...
A general-purpose 32 ms prosodic vector for hidden Markov modeling
Prosody plays a central role in conversation, making it important for speech technologies to model. Unfortunately, the application of standard modeling techniques to the acoustics of prosody has been hindered by difficulties in modeling intonation. In this work, we explore the suitability of the recently introduced fundamental frequency variation (FFV) spectrum as a candidate general representa...
Speech prosody in phonetics and technology
As features unique to spoken language, speech prosody plays an important role in human communication. Although the acoustic features of speech are viewed most frequently in a frame-by-frame manner, this is not always appropriate for prosodic features, since they are tightly related to higher level linguistic information, such as syntactic and discourse structures, and spread to wide time spans, ...
Prosody-dependent Acoustic Modeling for Mandarin Speech Recognition
A study on introducing prosodic information to acoustic modeling (AM) for speech recognition is reported in this paper. It extends the conventional context-dependent (CD) triphone HMM modeling approach to further consider the dependency of phone model on the break type of nearby inter-syllable boundary. Four break types are considered, including major break, minor break, normal non-break, and t...
Prosody recognition from speech utterances using acoustic and linguistic based models of prosodic events
A system for automatic recognition of prosodic events in speech utterances has been developed and applied to recognizing accent tones as defined by the tone and break index (ToBI) prosodic labeling standard. Both the acoustic and syntactic modeling portions of the system are described in the paper. The acoustic modeling portion of the system involves representation of ToBI labeled events using h...